Analysis of Data Cleansing Approaches regarding Dirty Data - A Comparative Study
Authors
Abstract
Data cleansing is the process of detecting and correcting errors and inconsistencies in a data warehouse. It deals with identifying corrupt and duplicate data inherent in the data sets of a data warehouse in order to enhance data quality. This research investigated several existing approaches and frameworks that attempt to solve the data cleansing problem and identified their strengths and weaknesses, which in turn revealed gaps in those frameworks and approaches. A comparative analysis of the four frameworks was conducted and, using standard testing parameters, a proposed feature was discussed to fill the gaps.
Similar resources
A Conceptual Framework for Data Cleansing – A Novel Approach to Support the Cleansing Process
Data errors occur in various ways when data is transferred from one point to the other. These data errors occur not necessarily from the formation/insertion of data but are developed and transformed when transferred from one process to another along the information chain within the data warehouse infrastructure. The main focus for this study is to conceptualize the data cleansing process from d...
A Review of Data Cleansing Concepts – Achievable Goals and Limitations
Data Cleansing is an activity involving a process of detecting and correcting the errors and inconsistencies in data warehouse. It deals with identification of corrupt and duplicate data inherent in the data sets of a data warehouse to enhance the quality of data. The study looked into investigating some research works conducted in the area of data cleansing. A thorough review into these existi...
Applying Ordinal Association Rules for Cleansing Data With Missing Values
Cleansing data of errors is an important processing step particularly when integrating heterogeneous data sources. Dirty data files are prevalent in data warehouses because of incorrect or missing data values, inconsistent attribute naming conventions or incomplete information. This paper improves the data cleansing ordinal association rules technique by proposing a solution for the missing val...
Cleansing and preparation of data for statistical analysis: A step necessary in oral health sciences research
In many published articles, there is still no mention of quality control processes, which might be an indication of the insufficient importance the researchers attach to undertaking or reporting such processes. However, quality control of data is one of the most important steps in research projects. Lack of sufficient attention to quality control of data might have a detrimental effect on the r...
A Study of the Behavior and Structure of Problem-Solving Styles of Knowledge-Oriented Human Resources
Foundations of creativity and innovation will be strengthened in the higher education sector only when approaches to settling in the issue of manpower are identified and geared toward appropriate behaviors. The present article studies the existing ways to solve the problem using four different approaches (sentimental, emotional, logical, and perceptual) and 32 relevant structures to come up wit...
Publication date: 2013